"Motherese" of Mr. Rogers
Abstract 

Dialogue from 30-minute samples each from Sesame Street
and Mr. Rogers' Neighborhood was described. Three aspects
of language were measured: grammar, content, and
discourse. The findings indicate that the dialogue of 
these programs is well suited to young viewers, with 
adjustments similar to those evident in adults' speech to 
young children. The mean length of utterance is 
comparable to that of adults in interactions with 
children, the ratio of different words to total words is 
the same as that of young children's language, sentence 
structure is simplified, and there is a heavy emphasis on 
the here and now (a majority of present tense verbs, a
high proportion of utterances about immediately visible 
topics or referents, and a preponderance of event casts as 
narrative structure). There are repeated instances of 
linguistic emphasis, with frequent repetition of key 
terms. Both programs avoid complex word forms. Overall, 
the dialogue of educational children's programs follows 
the constraints and adjustments evident in adults' child- 
directed language. 
Introduction

Dyadic interactions between adults and children have
been widely recognized as a source of linguistic input 
that is well suited to children's language acquisition. 
Adults tend to simplify their talk to children in a manner 
that has come to be known as "motherese." Among the 
features of motherese are an emphasis on the here and now,
with a restricted vocabulary and much paraphrasing; 
simple, well-formed sentences; frequent repetitions; and a 
slow rate of speech with long pauses between utterances 
and after content words (cf. Owens, 1984, p. 224). An
extensive literature has explored the implications of the
motherese register and how it may contribute to
children's language acquisition (e.g., Hoff-Ginsberg &
Shatz, 1982). The current conclusion is that the
simplified register is probably facilitative, although not
necessary, for language acquisition (Snow, 1984).

Live interactions with adults are not the sole source 
of linguistic input for young children in Western 
technologically advanced societies. Youngsters receive 
large amounts of exposure to the mass communication media. 
Children in the United States spend more time watching 
television than they do in school, in social interaction 
with other family members, or in any other waking activity 
(Singer, 1983). Children begin viewing during the 
language acquisition period of development. Infants
respond to the sights and sounds of TV (Hollenbeck & 
Slaby, 1979). Children between 1 and 2 years of age begin 
to react to particular characters and events on TV by 
pointing, labeling, and selective attention (Lemish, in 
press). By age 3 years, American children are regular 
viewers, averaging more than 2 1/2 hours of viewing daily 
(Huston, Wright, Kerkman, Seigle, Rice, & Bremer, 1983). 
Furthermore, young children's viewing is attentive. In 
the home situation, when the TV is on, children increase 
the percentage of time looking at the screen from 6% at 
age 1, to 40% at age 2, 67% at age 3-4, and 70% for 5- to 
6-year-olds (Anderson, 1983). While they view, they 
hear an extensive amount of dialogue. Insofar as children 
view frequently and attentively, the medium is potentially 
a major source of verbal information for children at the 
ages of rapid language acquisition. 

The dialogue of television has been dismissed as 
inappropriate for young children, because it is alleged
that "on television, people rarely talk about things
immediately accessible to view for the audience . . . they
(children) hear rapid speech that cannot easily be linked
to familiar situations" (Clark & Clark, 1977, p. 330).
That characterization was not completely supported in a 
descriptive study of the dialogue of television programs 
(Rice, 1984). In particular, the educational programs
sampled (Mr. Rogers' Neighborhood and Electric Company)
emphasized and simplified dialogue in a manner much like
motherese: slow rate, low rate of dysfluencies,
grammatical completeness, immediacy of reference, frequent 
rephrasings and emphasis of key words, and avoidance of 
nonliteral word meanings. 

The earlier study (Rice, 1984) is limited by a small 
sample size. Short bits (6 1/2 minutes) were selected 
from six different programs, representing educational 
programs, cartoons, and adult situation comedies. The 
programs' nonlinguistic production features as well as
linguistic features were described. Given the findings 
suggesting that educational programs for young children 
simplify dialogue to correspond to young viewers' language 
competencies, it is of interest to determine if that 
finding can be replicated with a more extensive sample of 
educational programming. 

It is the purpose of this study to describe the 
dialogue of samples of the two most popular educational 
programs for preschool children, Mr. Rogers' Neighborhood 
and Sesame Street, hereafter referred to as MR and SS. 
These two programs are broadcast nationally on public 
television. MR is aimed at children ages 2 to 4, and SS 
is aimed at children 3 to 6. MR emphasizes affective 
content, whereas SS focuses more heavily on cognitive
skills. They are widely viewed. For example, for one 
viewing week in 1983, an estimated 10.4 million American 
households tuned in to SS, and an estimated 5.5 million 
households viewed MR (Palmer, 1984, p. 117). SS is the 
most popular program of preschoolers, with 3-year-old 
children averaging 3 hours per week of SS viewing (Huston, 
Wright, Eakins, Kerkman, Pinon, Rosenkoetter, & Truglio,
1985). 
Procedures 

Stimulus selection. Four hours of broadcast
programming for MR and SS were dubbed off the air in June 
1984. From this 4-hour sample for each program, a 30- 
minute stimulus videotape was edited for each program. 
The bits were selected to meet the following criteria: 
they did not contain singing and extended rhyming, and 
they were judged by two adult viewers as representative of 
the overall content of the 4-hour sample. The SS sample 
consisted of 10 individual bits with an average bit length 
of 2.9 minutes. The MR sample included 8 bits with an 
average bit length of 3.9 minutes. Bits were defined as 
within topic discussions by the same characters, on the 
same set. Bit boundaries were established by consensus
between the two experimenters.
Transcription. The two stimulus videotapes were
transcribed verbatim by one of the experimenters. A 
second transcriber, a graduate student, checked the 
transcripts for accuracy. Agreement was high for both 
samples, at the 99% level. 

Coding. The transcripts were coded for three aspects 
of verbal communication: grammar, content, and discourse. 
The grammatical analysis was completed using the LINGQUEST 
computer-assisted language assessment program (Mordecai, 
Palen, & Palmer, 1982). Following the LINGQUEST protocol,
incomplete sentences, repetitions, and vocatives were
deleted from the analysis. In addition,
syntactically unstructured elements were deleted,
following the conventions of Barnes, Gutfreund, Satterly,
and Wells (1983). They include greetings (hi, bye),
politeness phrases (thank you), conversational fillers
(yes, good, elliptical deictic terms such as there),
sentence starters (now, and so), and exclamations (hah, oh
no). The LINGQUEST program requires preliminary coding of
nouns, certain verbs, gerunds, participles, and particles. 
The two experimenters coded each transcript individually, 
then resolved differences by consensus. 

The content coding was based on the categories 
developed for the Rice (1984) study. It consists of 
counts of the following categories: immediacy, emphasis,
nonliteral meanings, novel words, and explicit
instructions regarding how the viewing audience is 
supposed to interpret content. Immediacy involved coding 
comments according to the presence or absence of referents 
(referent immediately present on screen, removed from
sight, or nonreferential comments). Emphasis was defined 
as a means of giving selected prominence to a linguistic 
constituent for some sort of communicative purpose. It 
could be accomplished by one or more of the following 
linguistic devices: syntactic/pragmatic operations, such
as "It is . . ." or "This is a . . ."; stress;
repetition; and recasting in different linguistic contexts,
which involves a partial or complete repetition of a particular
linguistic form in a new communicative and/or linguistic
context. An example of recasting is:

Mr. Rogers: Just very fine dust. It's wood dust, isn't it?
Bob: Wood dust is right.
Mr. Rogers: Dust that comes from the wood.
Bob: Sometimes you get big curls of wood.

Nonliteral meanings included metaphors and puns. Novel 
words are those made up for the occasion, such as a doctor 
who makes cave calls. 

Coding of the content categories was done by the two 
experimenters individually. Reliability was calculated by 
dividing the number of agreements by the total number of
agreements and disagreements. Reliability for coding 
emphasis was 80% and for immediacy it was 83%. Reliability 
was not calculated for nonliteral meanings, explicit 
instructions, or novel words because of very low 
frequencies within these categories. Differences were 
almost always due to oversight, and were resolved by 
consensus between the two coders. 
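
The reliability index described above (agreements divided by agreements plus disagreements) can be illustrated with a minimal Python sketch. The function name and the sample codes are hypothetical and are not data from the study.

```python
def percent_agreement(coder_a, coder_b):
    """Agreements divided by agreements plus disagreements, as a percentage."""
    if len(coder_a) != len(coder_b):
        raise ValueError("both coders must code the same set of utterances")
    if not coder_a:
        raise ValueError("no coded utterances")
    agreements = sum(a == b for a, b in zip(coder_a, coder_b))
    return 100.0 * agreements / len(coder_a)


# Hypothetical immediacy codes for six utterances (illustration only).
coder_a = ["immediate", "immediate", "removed", "other", "immediate", "removed"]
coder_b = ["immediate", "removed", "removed", "other", "immediate", "removed"]

agreement = percent_agreement(coder_a, coder_b)
print(round(agreement, 1))  # 5 agreements out of 6 codes
```

Simple percent agreement of this kind does not correct for agreement expected by chance.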

The discourse categories were four types of 
narratives proposed by Heath and Branscombe (in press): 
recounts, accounts, event casts, and stories. Recounts are 
retellings in which information is known to both the 
teller and the listener. Accounts are narratives 
generated by either the teller or another party to provide 
new information or new interpretations of information 
which may already be known to both the teller and the 
listener. An event cast is a running narrative on events 
currently in the attention of the teller and listeners. 
This narrative may be simultaneous with the events or 
precede them. Stories include an animate being who moves 
through a series of events with goal-directed behavior. 
Each bit was categorized according to the dominant 
narrative type. The two experimenters coded the bits 
independently. Agreement was 100%. 
Results and Discussion

Grammar. The LINGQUEST analysis generated the
following variables for each bit: mean length of
utterance in words (MLU), type/token ratio, total number
of words, total number of utterances, percentage of 
present, past, and future tense verbs, four different 
categories of sentence types, and three different 
categories of questions. The results are presented in 
Table 1 where they are reported as bit means. 



Insert Table 1 about here 



The average MLU for an SS bit was 6.91 and for an MR 
bit was 7.42. The observed MLU for MR is comparable to 
the earlier sample, where the MLU in words was 7.21 (Rice, 
1984). The range was relatively restricted, from 5.89 to 
8.00 for SS and 6.23 to 8.42 for MR. The restricted range 
is related in part to the elimination of unstructured 
utterances, such as exclamations and politeness phrases. 
The short utterances that did occur were of the 
unstructured type, although it was possible for short 
structured utterances to have occurred. 
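
As a rough illustration of the measure, MLU in words can be sketched as total words divided by total utterances. The sketch below assumes simple whitespace tokenization, which may differ from the word-counting rules applied by LINGQUEST; the function name is hypothetical. The three utterances are Mr. Rogers' lines from the recasting example given under Procedures.

```python
def mlu_in_words(utterances):
    """Mean length of utterance: total words divided by number of utterances."""
    # Assumption: a "word" is any whitespace-delimited token.
    lengths = [len(u.split()) for u in utterances if u.split()]
    if not lengths:
        raise ValueError("no scorable utterances")
    return sum(lengths) / len(lengths)


bit = [
    "Just very fine dust.",
    "It's wood dust, isn't it?",
    "Dust that comes from the wood.",
]
print(mlu_in_words(bit))  # utterance lengths are 4, 5, and 6 words
```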

The MLU of the television characters compares 
favorably to observed MLUs of adults talking to children. 
Kindergarten teachers' utterances directed toward their 
students ranged from an MLU in words of 7.52 to 8.80, in
contrast to the same teachers' utterance length in 
conversations with their adult colleagues of 11.78 to 
18.48 (Granowsky & Krossner, 1970). Bohannon and Marquis 
(1977) report an MLU in morphemes of 6.43 for unfamiliar 
adults talking to a 3-year-old, compared to an MLU of 
6.95 for the 3-year-old's mother. They report an MLU of 
13.8 for adults talking to adults. Newport, Gleitman and 
Gleitman (1977) obtained mean MLUs in words of 4.24 for 
mothers talking to their 12-to-27-month-old children, vs. 
mean MLUs of 11.94 for mothers' speech to the adult
experimenter. 

The ratio of different words to total words used 
(Type/Token Ratio) was .45 for both programs. Comparative
data are available in Templin (1957), who reports a ratio
of .45 for children ages 3-4 years, and a range from .44 
to .47 for yearly increments up to age 8 years. 
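
The ratio of different words to total words can be sketched as follows. The tokenization rule (lowercased runs of letters and apostrophes) and the function name are assumptions made for illustration and are not the LINGQUEST procedure.

```python
import re


def type_token_ratio(utterances):
    """Number of different words (types) divided by total words (tokens)."""
    tokens = []
    for utterance in utterances:
        # Assumption: words are lowercased runs of letters and apostrophes.
        tokens.extend(re.findall(r"[a-z']+", utterance.lower()))
    if not tokens:
        raise ValueError("no words to count")
    return len(set(tokens)) / len(tokens)


bit = [
    "Just very fine dust.",
    "It's wood dust, isn't it?",
    "Dust that comes from the wood.",
]
ttr = type_token_ratio(bit)
print(round(ttr, 2))  # 12 different words among 15 total words
```

Because the ratio falls as a sample grows and words recur, a three-utterance excerpt such as this one yields a far higher value than the .45 observed for full bits.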

The analysis of verb tenses indicates that the 
majority of verbs are in the present tense, 77% for SS and
68% for MR. Past and future tenses are less frequent and 
roughly equal in probability. The majority of utterances 
are phrases or simple sentences of the NP + VP (+ NP) or NP
+ cop + NP structure. For SS, 10% of the utterances were 
phrases and 23% were simple sentences; for MR, 15% 
phrases and 18% simple sentences. An additional 10% of 
utterances for SS and for MR fell into one of the
following types: NP + aux + VP, or NP (+ aux) + cat 
(+ VP), or NP + modal (+ aux) + VP (+ NP). Sentences 
with infinitives were infrequent, as were compound 
sentences. The percentage of utterances unidentifiable by 
LINGQUEST (generally more complex structures, such as 
embeddings, complex questions, and complex structure 
combinations), was 27% for SS and 23% for MR. 

Questions were analyzed according to three 
categories: reversals, such as "Are you coming?"; rising 
intonation questions, such as "You want it?"; and 
questions formulated with Wh words, such as "What is 
that?" A total of 80 questions appeared in the SS 
sample, and 67 in MR. For SS, 27% were reversals, 22% 
were rising intonations, and 51% were Wh questions. For 
MR, 69% were reversals, 21% were rising intonations, and 
10% were Wh questions. Reversals are closely related to 
the yes/no questions that Newport et al. (1977) found to
be positively associated with children's auxiliary
acquisition. Also, Hoff-Ginsberg (1981) reported that
the frequency of some Wh questions in mothers' speech
predicted auxiliary growth in their children's speech.

Content. Results of the content coding are reported
in Table 2, as bit means per show. For the category of
immediacy, the majority of utterances for both SS and MR
were about referents immediately present on the screen,
with 58% for SS and 63% for MR. This suggests a strong
focus on the here and now in the programming, especially
when combined with the earlier finding of a large
proportion of present tense verbs.



Insert Table 2 about here



There were frequent instances of linguistic emphasis 
in both programs. The proportion was .94 for SS and .77 
for MR. This can be interpreted as almost one instance of
emphasis per utterance for SS, on the average. The 
measure also indicates the considerable redundancy of 
linguistic forms and associated content that is evident in 
the programs. Key terms appear repeatedly throughout a 
bit, often recast in different linguistic frames. For 
example, in a 4-minute segment of MR, with a total of 45
utterances, there were 29 occurrences of the word ball
(or balls).
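
A count of this kind can be sketched as below. The function name and the sample utterances are hypothetical; only the figures of 29 occurrences in 45 utterances come from the text.

```python
import re


def count_key_term(utterances, term):
    """Count occurrences of a term, singular or plural, across utterances."""
    pattern = re.compile(rf"\b{re.escape(term)}s?\b", re.IGNORECASE)
    return sum(len(pattern.findall(utterance)) for utterance in utterances)


# Hypothetical utterances (illustration only, not a program transcript).
sample = [
    "Here is a ball.",
    "This ball is red.",
    "Balls can bounce.",
    "Watch it bounce.",
]
occurrences = count_key_term(sample, "ball")
per_utterance = occurrences / len(sample)

# Rate implied by the figures reported in the text.
reported_rate = 29 / 45
print(occurrences, per_utterance, round(reported_rate, 2))
```

On the reported figures, the key term occurred at a rate of roughly two occurrences for every three utterances.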

Another assist to the viewer is the frequent use of 
proper names as direct addresses between two 
interlocutors. Given the fact that the characters on the 
programs are very familiar with each other, it certainly 
is not necessary for them to use each other's names in
casual conversation. Yet almost always the initial
appearance of a character is accompanied by one or 
several insertions of the character's name in the opening 
conversational interactions. On the other hand, both 
programs pointedly avoid adult-like complex word forms. 
Nonliteral meanings such as sarcasm, puns, or slang words 
and novel words are rare occurrences. 

Explicit acknowledgement of the home viewer is 
evident in the 3% of SS utterances that were direct 
instructions, and the 17% offered by MR. Examples 
are: "Now tell me when it goes off." "Now tell me which 
one I'm going to put on now." And "Now, which one is this 
one?" The instructions are followed by pauses long enough 
for a response, and usually, but not always, the answer is 
then provided. This technique has been referred to as 
"the phantom reinforcer" (Palmer, 1978). This study's 
estimate for frequency of use in SS probably 
underrepresents the actual frequency, insofar as many of 
the recurrent formats of SS that provide a pause for 
audience participation appear in songs, which were
omitted from this sample. An example is the well known 
categorization song that begins "one of these things is 
not like the other. . ." and leaves a blank in the song 
for the child to fill with the name of the odd object. 
Observations of young children viewing in their homes
indicate that often they do respond (Lemish, in press). 

Discourse. The emphasis on the here and now is
evident at the level of narrative type. Of the 10 SS 
bits, 9 were event casts, involving a running narrative or 
conversational interchange about events currently in the 
attention of the teller and the observers. In one of 
these bits, a remembered past event was presented as an 
event cast by means of a flashback to an earlier time. 
The viewer saw the remembered events and interactions, 
with a voice-over narration. This strong reliance on 
event casts is possible because of television's ability to 
transcend temporal constraints. The other SS bit was an 
account, although a rather odd one. It was a parody of a 
commercial, with a speaker "advertising" rain by extolling 
the virtues of rain, accompanied by characters walking 
into the announcer's office setting wearing various rain 
attire.

All the MR bits were event casts. One started with a 
brief recount of the previous day's events, and two had 
embedded short accounts. 

Comparison of Sesame Street and Mr. Rogers'
Neighborhood. A series of t tests was conducted on the
grammar and content variables to investigate possible
differences between the two programs. Differences were
apparent for the following variables: reversal questions,
t(16) = 4.183, p < .001; Wh questions, t(16) = 5.916,
p < .001; direct instructions to the viewer, t(16) =
2.257, p < .05. There is a higher proportion of reversal
questions, fewer Wh questions, and a higher proportion of 
direct instructions to the viewer on MR as compared to SS. 
Overall, the extent of the similarity of dialogue 
characteristics of the two programs is striking, given the 
differences in production techniques and the different 
content emphases. 
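
The 16 degrees of freedom reported above are consistent with an independent-samples t test on 8 MR bits and 10 SS bits (8 + 10 - 2 = 16). The sketch below assumes the pooled-variance form of the test, which the text does not specify; the function name and the per-bit percentages are hypothetical and are not the study's data.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(sample_a, sample_b):
    """Independent-samples t statistic with pooled variance, plus its df."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    pooled_var = (
        (n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)
    ) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(sample_a) - mean(sample_b)) / standard_error, df


# Hypothetical per-bit percentages of reversal questions (illustration only).
mr_bits = [70, 65, 72, 60, 75, 68, 71, 66]          # 8 bits
ss_bits = [25, 30, 22, 35, 28, 20, 31, 27, 24, 29]  # 10 bits

t_stat, df = pooled_t(mr_bits, ss_bits)
print(df, round(t_stat, 3))
```
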
Conclusions 

One of children's favorite activities is viewing 
television. Among the most popular programs for young 
children in the United States are the educational programs 
broadcast on public television, Sesame Street and Mr.
Rogers' Neighborhood. As children view, they experience
dialogue as well as visual information. Contrary to earlier 
assumptions, the dialogue of these programs is well suited 
to the young viewer, with adjustments similar to those 
evident in adults' speech to young children. The mean 
length of utterance is reduced, the ratio of different 
words to total words is comparable to that of young 
children, sentence structure is simplified, and there is a 
heavy emphasis on the here and now (a majority of present 
tense verbs, a high proportion of utterances about 
immediately visible topics or referents, and a
preponderance of event casts as narrative structure). The
questions used are of the two types previously reported to
be associated with children's acquisition of auxiliaries,
those of reversals (yes/no questions) and Wh questions. 

Furthermore, there are indications of explicit 
attempts to ensure children's comprehension of linguistic
forms. There are frequent instances of linguistic
emphasis, where targeted linguistic forms are stressed, 
repeated in new linguistic frames, or otherwise emphasized 
in the dialogue. Key terms appear repeatedly. The proper 
names of characters are used consistently near the 
beginning of conversational interactions. In addition, 
both programs avoid complex word forms, such as ones with 
nonliteral meanings or novel forms. 

While the medium does not allow for interaction 
between viewer and television character, there are, 
nevertheless, attempts to elicit responses from the 
viewers. These appear as explicit directions to the 
viewer, a device used in dialogue more by MR than SS, 
although SS often uses songs to do this. 

Overall, the dialogue of children's educational 
television programs provides a model of language form, 
structure and use that is well suited to the young 
viewer's linguistic competencies. Observations of 
children's responses and comments in the home viewing
situation indicate that they readily assume that the 
dialogue is meaningful, and that they comprehend what they 
hear (Lemish & Rice, 1984). 

Children's ability to extract from the dialogue 
linguistic information that they apply to their own 
mastery of language remains to be seen. To some extent, 
the same arguments that have been proposed for the 
facilitative effects of motherese can be applied to 
television viewing. On the other hand, there are some 
significant differences between live interactions and the 
viewing circumstances. The major one is that in live 
conversations adults can respond to what a child says, by 
repeating, expanding, or extending a child utterance. 
This feature of semantic contingency has been linked with 
children's language acquisition (e.g., Snow, 1984; Wells,
1985). Facilitative effects are attributed to adults' 
provision of linguistic models for what the child is 
trying to express, the content of immediate interest to 
the child. Wells (1985) points out that adult-child 
interactions are embedded in a conversational setting in 
which the two parties are trying to communicate with each 
other. Adults generally do not intend to model 
linguistic forms to children. Expansions are often 
attempts to interpret what the child means to say and to 
arrive at a mutual understanding of a common topic. Nor
does the child model his speech on what he hears in any 
sort of straightforward way. The critical features of 
live interactions are joint attention to the same topic, 
mutual comprehension of content, and encouragement for
conversation. According to Wells, the provision of 
child-appropriate language, the linguistic adjustments of 
adults, are secondary consequences of the communicative 
context. The dialogue of children's television programs 
also focuses on successful communication with the child 
viewer. While the TV characters do not follow up on 
topics initiated by the child viewer, the content is 
evidently of interest to children, insofar as it 
maintains their attention. Furthermore, the program 
content is comprehensible. In short, educational
programs create an attentive situation to which they then
respond in a presentation comprehensible to young
viewers.¹

Given the noninteractive nature of viewing, language
learning will depend upon how much children can draw upon
observational learning, a possibility relatively
overlooked in the child language literature (cf. Heath's
description of observational circumstances for language
acquisition, 1983). Attention and comprehension-based
analyses are surely critical moderators of observational
learning. Child viewers must draw upon strategies for
coping with new linguistic information that are
consistent with the processing demands of the medium.
One candidate is a wholistic strategy for language learning
(cf. Peters, 1983), as is apparent in the tendency of
children to repeat phrases and jingles of commercials.
Another possibility is that young viewers call upon a
fast mapping of new linguistic forms, an initial quick
but superficial grasp of linguistic meanings (cf. Carey,
1978). These possibilities are amenable to
investigation. The extent to which young children
benefit from the "motherese" of Mr. Rogers and the
occupants of Sesame Street is a matter worth further
study.

¹ would like to thank Catherine Snow for this succinct
characterization.

Table 1.

Grammatical Features Means per Bit for Sesame Street and Mr. Rogers'
Neighborhood

                                          Total     Total        Verbs¹
                    MLU         TTR       Words     Utterances   Pres.   Past          Future

Sesame Street       6.91        .45       280       43.6         77%     [illegible]   11%
  Range             5.89-8.00   .38-.59   144-449   19-69        47-92   0-45          0-19
  (N of bits = 10)

Mr. Rogers          7.42        .45       367       55.5         68%     15%           17%
  Range             6.23-8.42   .34-.56   127-766   16-128       62-86   0-29          9-33
  (N of bits = 8)

¹Calculated as total number of instances per category divided by the total
number of utterances for the grand mean, divided by the number of bits
for the bit mean.

Table 2.

Content Category Means per Bit for Sesame Street and Mr. Rogers' Neighborhood

                    Immediacy¹                                Nonliteral
                    Immediate   Removed   Other   Emphasis²   Meanings

Sesame Street       58%         33%       9%      .94         1%
  Range             21-88       9-68      3-16    .60-1.16    0-7

Mr. Rogers          63%         26%       10%     .77         0%
  Range             11-88       10-70     0-29    .57-1.22    0-0


                    Novel Words   Direct Instructions   Direct Addresses
                                                        (Proper Names)

Sesame Street       1%            3%                    22%
  Range             0-4           0-13                  0-43

Mr. Rogers          1%            17%                   1.4%
  Range             0-2           2-51                  0-36

¹Calculated as total number of instances per category divided by total
number of utterances for the grand means, divided by number of bits
for the bit mean.

²Emphasis is calculated as the total number of occurrences of emphasis
divided by the total number of utterances. Because it was possible
to have more than one instance of emphasis per utterance, the
proportions can exceed 1.00.